Short reads-based characterization of pathotype diversity and drug resistance among Escherichia coli isolated from patients attending regional referral hospitals in Tanzania

Background Escherichia coli is known to cause about 2 million deaths annually, with diarrheal infection the leading cause, typically occurring in children under 5 years old. Although Africa is the most affected region, there is little information on E. coli pathotype diversity and antimicrobial resistance there. Objective To determine the pathotype diversity and antimicrobial resistance among E. coli from patients attending regional referral hospitals in Tanzania. Materials and methods This was a retrospective cross-sectional laboratory-based study in which a total of 138 archived E. coli isolates collected from 2020 to 2021 from selected regional referral hospitals in Tanzania were sequenced on the Illumina NextSeq 550 platform. Sequences were analyzed with the CGE tools to identify resistance genes and virulence genes. SPSS version 20 was used to summarize data using frequency and proportion. Results Among all 138 sequenced E. coli isolates, the most prevalent pathotype-defining virulence genes were those of extraintestinal E. coli: the UPEC fyuA gene, 82.6% (114/138), and the NMEC irp gene, 81.9% (113/138). Most of the E. coli pathotypes observed existed as hybrids because of overlapping virulence genes; the most prevalent pathotypes were NMEC/UPEC hybrid 29.7% (41/138), NMEC/UPEC/EAEC hybrid 26.1% (36/138), NMEC/UPEC/DAEC hybrid 18.1% (25/138) and EAEC 15.2% (21/138). Overall, most E. coli carried resistance genes to ampicillin 90.6% (125/138), trimethoprim 85.5% (118/138), tetracycline 79.7% (110/138), ciprofloxacin 76.1% (105/138) and nalidixic acid 72.5% (100/138). Hybrid pathotypes were more resistant than non-hybrid pathotypes. Conclusion Whole genome sequencing reveals the presence of hybrid pathotypes with increased drug resistance among E. coli isolated from regional referral hospitals in Tanzania. Supplementary Information The online version contains supplementary material available at 10.1186/s12920-024-01882-y.


Introduction
Escherichia coli is a gram-negative, rod-shaped bacterium of the family Enterobacteriaceae known to cause a variety of diseases, ranging from diarrheal to extraintestinal infections, through its pathogenic effects and the production of various toxins and other virulence factors [1]. It is known to cause about 2 million deaths annually as a result of diarrheal infection, typically occurring in children under 5 years old and mostly affecting tropical and sub-tropical regions [2]. In Dar es Salaam, Tanzania, a previous study reported that about 51.1% of UTI cases are caused by E. coli [3]. Similarly, in the northern part of Tanzania E. coli accounts for 46.2% of all UTI cases among children [4]. Another study conducted among HIV-infected people in the northern part of Tanzania reported a 12.3% prevalence of bacteriuria, with E. coli the causative agent in 16.2% of cases [5]. It has also been demonstrated that E. coli causes about 11.7% of human bacterial infections [6].
The World Health Organization recognizes antimicrobial resistance as one of the top 10 global public health threats facing humanity today. Resistant bacterial infections are associated with 4.95 million deaths per year [7]. Antimicrobial resistance increases healthcare costs through more expensive therapy, prolonged hospitalization, and high morbidity rates [8]. It is estimated that by 2050, drug resistance will cause more deaths than all cancers combined [9].
Horizontal transfer of resistance and virulence genes among different E. coli increases antimicrobial resistance and creates diversity. In addition, hybrid pathotypes may result from overlapping virulence genes [14][15][16]. Hybrid pathotypes cause more severe disease and complications. For example, a single E. coli strain has been reported to cause both diarrhea and hemolytic uremic syndrome [17]. Another report suggested that 93.5% (101/108) of E. coli isolates originally classified as intestinal E. coli also carry genes for extraintestinal E. coli [18].
Despite the significance of next-generation sequencing in analyzing pathogenic bacteria, its use is rare in low- and middle-income countries like Tanzania, leading to a scarcity of information on diversity among E. coli pathotypes. Whole genome sequencing (WGS) of a particular pathogen provides more precise information on species identity, resistance genes, plasmids, virulence genes, multi-locus sequence typing (MLST), and serotyping than other methods [19,20]. Understanding diversity among the pathotypes is important for determining the genetic relatedness and variability among strains, while assessing drug resistance patterns is essential for guiding appropriate treatment strategies. This study aims to determine the level of antimicrobial resistance and pathotype diversity among E. coli isolated from regional referral hospitals in Tanzania.

Study design, study participants, and study sites
This was a retrospective cross-sectional laboratory-based study using archived E. coli culture isolates from urine, wound, pus, blood, sputum, and stool samples collected from January 2020 to December 2021 at regional referral hospitals in Tanzania. The study was nested within the SeqAfrica project, whose main objective is to develop, expand, and support whole genome sequencing and bioinformatics capacity for antimicrobial resistance surveillance across Africa. The project operates in four African countries: Ghana, Nigeria, South Africa, and Tanzania. In Tanzania, six study areas were selected: Tabora, Dodoma, Songea, Kigoma, Morogoro, and Zanzibar.

DNA extraction and whole genome sequencing
DNA from E. coli strains was extracted using the Quick-DNA™ Fungal/Bacteria Miniprep Kit as per the manufacturer's instructions. The purity and quantity of the extracted DNA were checked using a Qubit® 4.0 fluorometer. Library preparation followed the NEBNext® Ultra™ II FS DNA Library Prep Kit manual (2020). Briefly, library preparation involves fragmentation, adaptor ligation, size selection, and indexing (barcoding) of the DNA extracted from each E. coli isolate. The prepared library was normalized and combined with PhiX control before loading onto the Illumina NextSeq 550 platform for sequencing.

Bioinformatics analysis
Quality control of the raw sequence data was performed using FastQC 0.12.0 [21]. De novo assembly was performed using SPAdes 3.15.5 [22], and the final output files were in FASTA format. The Bacterial Analysis Pipeline (BAP 3.3.2), which is based on the services available at the Center for Genomic Epidemiology (CGE) (https://www.genomicepidemiology.org/services/), was used for downstream analysis. Species were identified using KmerFinder 3.2 [23][24][25], resistance genes were determined using ResFinder 4.1 [26][27][28], and virulence genes were determined using VirulenceFinder 2.0 [26,29,30]. The virulence genes used to define E. coli pathotypes are presented in supplementary Table 1. The assembled E. coli genomes from this study have been submitted to the European Nucleotide Archive under project accession number PRJEB71714. SPSS version 20 was used to summarize data using frequency and proportion.
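The pathotype assignment and frequency summary described above can be illustrated with a minimal Python sketch. It is not the CGE pipeline or the SPSS procedure used in the study. The marker-to-pathotype map is a hypothetical stand-in for supplementary Table 1: only fyuA (UPEC) and irp (NMEC) are named in this paper, while the allele name irp2 and the aggR and afaD markers are assumptions made for illustration. The four isolates are invented examples, not study data.

```python
from collections import Counter

# Hypothetical marker map standing in for supplementary Table 1.
# Only fyuA (UPEC) and irp (NMEC) are named in the paper; irp2, aggR
# and afaD are illustrative assumptions.
MARKERS = {
    "fyuA": "UPEC",
    "irp2": "NMEC",
    "aggR": "EAEC",
    "afaD": "DAEC",
}

# Fixed label order so hybrids read like the paper's "NMEC/UPEC/EAEC".
ORDER = ["NMEC", "UPEC", "EAEC", "DAEC"]


def call_pathotype(virulence_genes):
    """Return a pathotype label; two or more matched pathotypes give a hybrid."""
    matched = {MARKERS[g] for g in virulence_genes if g in MARKERS}
    if not matched:
        return "unassigned"
    label = "/".join(sorted(matched, key=ORDER.index))
    return label + " hybrid" if len(matched) > 1 else label


def summarize(labels):
    """Frequency and proportion (%) of each label, most common first."""
    total = len(labels)
    return [
        (label, n, round(100 * n / total, 1))
        for label, n in Counter(labels).most_common()
    ]


# Invented VirulenceFinder-style gene lists, not study data.
isolates = {
    "iso1": ["fyuA", "irp2"],
    "iso2": ["fyuA", "irp2", "aggR"],
    "iso3": ["aggR"],
    "iso4": ["fyuA", "irp2"],
}

labels = [call_pathotype(genes) for genes in isolates.values()]
for row in summarize(labels):
    print(row)
```

Defining a hybrid as any isolate matching markers of more than one pathotype is one simple reading of "gene overlapping"; the study's actual rule depends on the full marker list in supplementary Table 1.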

Association between E. coli pathotypes and their antimicrobial resistance profiles
Across all pathotypes, resistance was highest to ampicillin and trimethoprim, and hybrid pathotypes were more resistant than non-hybrid pathotypes. Nearly all pathotypes were susceptible to meropenem; low-level resistance of 2.4% and 2.8% was seen only in the NMEC/UPEC hybrid and NMEC/UPEC/EAEC hybrid, respectively (Table 5).
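The per-pathotype percentages are counts of resistant isolates divided by the size of each pathotype group. The short Python sketch below works through that arithmetic for meropenem. The group sizes (41 and 36) and the total of 138 come from the abstract; the resistant counts of one isolate per group are an assumption inferred from the reported 2.4%, 2.8%, and 1.4% figures, not values quoted from Table 5.

```python
def resistance_rate(resistant, total):
    """Percentage of isolates in a group predicted to be resistant."""
    if total <= 0:
        raise ValueError("group size must be positive")
    return round(100 * resistant / total, 1)


# (resistant, group size). Group sizes are from the abstract; the
# resistant counts are assumed, chosen to match the reported percentages.
meropenem = {
    "NMEC/UPEC hybrid": (1, 41),
    "NMEC/UPEC/EAEC hybrid": (1, 36),
}

for pathotype, (resistant, total) in meropenem.items():
    print(pathotype, resistance_rate(resistant, total))

# Overall rate if these two isolates are the only resistant ones (assumption).
overall = resistance_rate(2, 138)
print("overall", overall)
```

Under these assumed counts the two group-level figures and the overall figure are mutually consistent, which is a quick sanity check worth running on any stratified table.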

Discussion
The present study aimed to determine the diversity among E. coli pathotypes and their associated drug resistance in regional referral hospitals in Tanzania using whole genome sequencing. High diversity in E. coli pathotypes was observed: extraintestinal E. coli virulence genes for the UPEC and NMEC pathotypes were the most prevalent, while the least prevalent virulence genes were those of diarrheagenic E. coli pathotypes (Table 3). These specific virulence genes, together with other factors such as tissue tropism, pathogenesis, and clinical features, as described in [1,31,32], are what divide E. coli into intestinal and extraintestinal pathotypes. The high prevalence of UPEC reported here might also reflect the larger number of urine samples used in this study. Horizontal transfer of pathotype-specific virulence factors can create diversity and lead to the formation of hybrid pathotypes [15,16]. As a result of the virulence gene overlap reported in this study, seven different types of hybrid pathotypes were observed (Table 4). The only pathotypes that were not hybrid were EAEC and UPEC. The combination of virulence factors to form hybrid or hetero-hybrid pathotypes results in more severe disease with more complications [33]. A similar case was reported by [17], in which a single E. coli strain caused both diarrhea and hemolytic uremic syndrome, creating more complications and raising questions about the best therapy to use. Another study from Norway reported that 93.5% (101/108) of E. coli isolates originally classified as intestinal E. coli also carry genes for extraintestinal E. coli [18].
The resistance genes mcr-1 and blaCTX-M-15-like have also been reported in Zanzibar among sequenced E. coli, with prevalences of 55% and 51.0%, respectively. These genes confer resistance to colistin and extended-spectrum cephalosporins, respectively [36,37]. Another study conducted in Tanzania reported that 63% (77/123) of gram-negative bacteria carry the resistance gene dfrA, of which dfrA1 was frequent in E. coli isolates [38].
Despite the high prevalence of resistance genes among the sequenced E. coli, resistance genes to meropenem, amoxicillin-clavulanate, and gentamicin were less common, with prevalences of 1.4%, 29.7%, and 37.7%, respectively. This offers hope that these drugs can serve as alternatives for treating E. coli infection despite the otherwise reduced treatment options.
Furthermore, resistance genes observed in the present study can confer resistance to more than one antibiotic, of the same or of different classes. This has also been reported in other studies: the aac(6')-Ib-cr gene confers resistance to the aminoglycoside amikacin and the quinolone ciprofloxacin [39,40], and gyrA mutations can confer resistance to nalidixic acid and ciprofloxacin, both of which belong to the quinolone group [41,42]. The clinical implication of these results is that treatment options and management of E. coli infection should be reviewed, particularly in low- and middle-income countries. Additionally, the findings call for the proper use of antibiotics to reduce the development of drug resistance and disease severity.
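The one-determinant-to-several-antibiotics relationships described in this section can be sketched as a small lookup table in Python. The table holds only the two examples given in the text; it is not a ResFinder database extract. The first gene is written under its commonly used name aac(6')-Ib-cr, which is an assumption about the gene intended, and the function names are hypothetical.

```python
# Determinant -> list of (antibiotic, class), taken from the two examples
# in the text. Illustrative only; not an export of any resistance database.
GENE_TARGETS = {
    "aac(6')-Ib-cr": [
        ("amikacin", "aminoglycoside"),
        ("ciprofloxacin", "quinolone"),
    ],
    "gyrA mutation": [
        ("nalidixic acid", "quinolone"),
        ("ciprofloxacin", "quinolone"),
    ],
}


def predicted_resistance(determinants):
    """Sorted antibiotics predicted to be affected by the given determinants."""
    return sorted(
        {drug for d in determinants for drug, _ in GENE_TARGETS.get(d, [])}
    )


def spans_multiple_classes(determinant):
    """True if one determinant affects antibiotics from more than one class."""
    classes = {cls for _, cls in GENE_TARGETS.get(determinant, [])}
    return len(classes) > 1


print(predicted_resistance(["aac(6')-Ib-cr", "gyrA mutation"]))
print(spans_multiple_classes("aac(6')-Ib-cr"))
print(spans_multiple_classes("gyrA mutation"))
```

Separating the drug from its class makes it easy to distinguish within-class cross-resistance (the gyrA example) from resistance that crosses classes (the aac(6')-Ib-cr example).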

Conclusion
Treatment options for E. coli infections are reduced because of increased drug resistance; the only drug with low observed resistance in our setting was meropenem (1.4%), a β-lactam antibiotic of the carbapenem class. E. coli pathotype diversity, disease severity, and complications are increasing because of the large number of hybrid pathotypes observed.

Table 1
Distribution of E. coli across the hospitals

Table 2
Antimicrobial resistance genes and predicted phenotypic resistance profiles (N = 138). *AMR gene: antimicrobial resistance gene

Table 5
E. coli pathotypes with drug resistance